Minimizing Testing Overheads in Database Migration Lifecycle

Authors

  • Sangameshwar Patil
  • Sasanka Roy
  • John Augustine
  • Amanda Redlich
  • Sachin Lodha
  • Harrick M. Vin
  • Anand Deshpande
  • Mangesh Gharote
  • Ankit Mehrotra
Abstract

As part of their information management lifecycle, organizations periodically face the important and inevitable task of migrating databases from one software and hardware platform to another. Apart from ensuring accuracy and consistency, information managers of such organizations need to minimize the costs associated with the database migration process. Some of the migration tasks, such as identifying database-dependent applications and replicating data, are primarily governed by the technology used and have relatively predictable costs. On the other hand, the cost of testing and verification in the migration lifecycle depends significantly on the planning and dominates other costs. We refer to this problem of optimizing testing cost as the Database Migration Problem (DBMP). DBMP is challenging for enterprises because a large number of databases and applications are affected, and a variety of constraints force the migration to spread over multiple phases (or migration waves, as they are typically called in industry parlance). After each migration wave, significant overheads arise for ensuring data integrity and testing the applications for correctness. We have observed that the industry practice for migration planning is based on the experience and intuition of a few software architects and administrators. This often results in delayed migration schedules and spiraling costs.
However, a careful partitioning of databases into waves can lower the testing overheads and result in significant financial savings of several hundred thousand dollars. Surprisingly, we did not find any literature that addresses this problem. Hence, in this paper, we focus on minimizing the testing overheads of the data migration lifecycle. We begin by showing that DBMP is NP-hard. Then, based on our real-life experience, we formulate DBMP so that it is amenable to formal, rigorous analysis and provide algorithms that cater to different scenarios likely to arise in practice. For small problem instances, we provide an optimal solution using integer linear programming. For larger instances, we formulate DBMP as a hypergraph partitioning problem and use the well-known hMETIS tool to solve it. hMETIS provides good solutions quickly, but violates some of the constraints that are important in industry; for instance, the total size of databases packed in a wave cannot exceed the maximum wave size limit. Finally, to tackle such scenarios, we propose a new algorithm, WAVE-FIT. Using experimental evaluation, we show that WAVE-FIT provides solutions comparable to hMETIS in reasonable time without violating any constraint.

∗ Corresponding author.
† Currently at Chennai Mathematical Institute, India.
‡ Currently at Nanyang Technological University, Singapore.
§ Dept. of Mathematics, Massachusetts Inst. of Tech., USA.
¶ Currently unaffiliated with TCS.
‖ Currently unaffiliated with TCS.

International Conference on Management of Data (COMAD 2010), Nagpur, India, December 8–10, 2010. © Computer Society of India, 2010.

Similar resources

A Partitioning strategy for OODB

An effective strategy for distributing data across multiple disks is crucial to achieving good performance in a parallel object-oriented database management system. During query processing, a large amount of data needs to be processed and transferred among the processing nodes in the system. A good data placement strategy should be able to reduce the communication overheads, and, at the same time...

Scaling transactional workloads on the cloud

In this paper, we address the problem of transparently scaling out transactional (OLTP) workloads on relational databases, to support database-as-a-service in a cloud computing environment. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availabi...

The Relationship between Lifecycle and Idiosyncratic Volatility with Emphasis on Fundamental and Information Uncertainty of Firms listed on the TSE

Given the importance and the increasing trend of idiosyncratic volatility in recent years, the study of factors affecting idiosyncratic volatility is one of the important issues in financial markets. The purpose of this study is therefore to investigate the relationship between lifecycle and idiosyncratic volatility with emphasis on fundamental and information uncertainty. In this regard, 152 ...

Adaptive partitioning and indexing for raw data querying

The traditional database management system approach to data analytics assumes that the input is loaded into the DBMS and then queried. However, data analytics depends on interaction with the data analyst, and as data collections grow larger and larger, data loading becomes a bottleneck and incurs a significant data-to-query delay. In this paper, we examine the NoDB paradigm, which...

A Practical Approach to Software Continuous Delivery Focused on Application Lifecycle Management

Delivering quality software continuously is a challenge for many organizations. This is due to factors such as configuration management, source code control, peer review, delivery planning, audits, compliance, continuous integration, testing, deployments, dependency management, database migration, creation and management of testing and production environments, traceability and data post-fact, in...

Publication date: 2010